Simon S. Du

Frank

Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback

Dec 31, 2025

Understanding the Gain from Data Filtering in Multimodal Contrastive Learning

Dec 16, 2025

Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs

Jun 06, 2025

Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixtures

Jun 06, 2025

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

May 26, 2025

Improving Human-AI Coordination through Adversarial Training and Generative Models

Apr 21, 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Apr 20, 2025

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback

Mar 11, 2025

Anytime Acceleration of Gradient Descent

Nov 26, 2024

Learning to Cooperate with Humans using Generative Agents

Nov 21, 2024